Abstract

In this paper the author proposes to use the Least Squares Lattice (LSL) filter
with forgetting factor to estimate the time-varying parameters of a model for
noise processes. We simulate an Auto-Regressive (AR) noise process whose
parameters vary in time. We investigate a new implementation of the Least
Squares Lattice filter for following the non-stationary time series of a
stochastic process. Moreover, we introduce a modified Least Squares Lattice
filter to whiten the time series and to remove the non-stationarity. We apply
this algorithm to the identification of real time series data produced by
recorded voice.
1 Introduction



In the application of optimal filtering to the detection of a signal buried in
noise, a whitening procedure is often useful [1,2]. If the noise in which the
signal is hidden is non-stationary or non-Gaussian, we can no longer apply the
optimal filter for stationary and Gaussian noise. This work focuses on the
application of the Least Squares Lattice algorithm [4,5,13,14] to the
identification of non-stationary noise in a stochastic process, to the
whitening procedure, and to the possibility of making a non-stationary process
stationary.

To this aim we apply this algorithm to some toy models that we built using a
non-stationary autoregressive model (see section 5) [3,12,10]. In section 2 we
show the whitening techniques based on a lattice structure; in section 3 we
introduce the adaptive Least Squares methods and their application to simulated
non-stationary data. In sections 5 and 6 we show how it is possible to whiten
the data and to eliminate the non-stationarity present in the noise. We apply
this algorithm to simulated and real data.



2 The autoregressive model and the whitening 



An Auto-Regressive process $x[n]$ of order $P$ with parameters $a_k$, hereafter
AR($P$), is characterized by the relation

\[ x[n] = \sum_{k=1}^{P} a_k x[n-k] + \sigma w[n] , \qquad (1) \]

where $w[n]$ is a white Normal process.

The problem of determining the AR parameters is the same as that of finding
the optimal "weights vector" $\mathbf{w} = \{w_k\}$, $k = 1, \ldots, P$, for the
problem of linear prediction [3]. In linear prediction we predict the sample
$x[n]$ using the $P$ previously observed data
$\mathbf{x}[n] = \{x[n-1], x[n-2], \ldots, x[n-P]\}$, building the estimate as
a transversal filter:

\[ \hat{x}[n] = \sum_{k=1}^{P} w_k x[n-k] . \qquad (2) \]



We choose the coefficients of the linear predictor by minimizing a cost
function, the mean squared error $\epsilon = \mathcal{E}[e[n]^2]$
($\mathcal{E}$ is the ensemble average operator), with

\[ e[n] = x[n] - \hat{x}[n] \qquad (3) \]



the error we make in this prediction; this leads to the so-called Normal or
Wiener-Hopf equations

\[ \epsilon_{\min} = r_{xx}[0] - \sum_{k=1}^{P} w_k r_{xx}[-k] , \qquad (4) \]



which are identical to the Yule-Walker equations [3] used to estimate the
parameters from the autocorrelation function, with $w_k = -a_k$ and
$\epsilon_{\min} = \sigma^2$.

Fig. 1. Whitening filter and AR filter.

This relationship between the AR model and linear prediction assures us of
obtaining a filter which is stable and causal [3], so we can use the AR model
to reproduce stable processes in the time domain. It is this relation between
the AR process and the linear predictor that becomes important in building the
whitening filter.

The tight relation between the AR filter and the whitening filter is clear in
figure 1. Read from left to right, the figure describes how an AR process
colors a white process at the input of the filter. Read from right to left, a
colored process at the input passes through the inverse AR filter and comes
out as a white process.

When we find the $P$ parameters that fit the PSD of a noise process, we are
finding the optimal vector of weights that lets us reproduce the process at
time $n$ knowing the process at the $P$ previous times. All the methods that
involve this estimation try to make the error signal (see equation (3)) a
white process, so as to remove all the correlation between the data (which we
use for the estimation of the parameters).
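As a minimal sketch of the relation between AR fitting, linear prediction and whitening, the following Python example simulates an AR(2) process, solves the normal equations built from the sample autocorrelation, and filters the data with the resulting prediction-error filter. The function names, the sample size and the coefficient values are illustrative assumptions. The code assumes the sign convention $x[n] = \sum_k a_k x[n-k] + \sigma w[n]$, under which the fitted weights are compared directly with the simulated coefficients.

```python
import numpy as np


def simulate_ar(a, sigma, n, rng):
    """Simulate x[n] = sum_k a[k-1] * x[n-k] + sigma * w[n] (assumed sign convention)."""
    a = np.asarray(a, dtype=float)
    p = len(a)
    w = rng.standard_normal(n + p)
    x = np.zeros(n + p)
    for i in range(p, n + p):
        # x[i-p:i][::-1] is (x[i-1], ..., x[i-p])
        x[i] = np.dot(a, x[i - p:i][::-1]) + sigma * w[i]
    return x[p:]


def yule_walker(x, p):
    """Solve the normal equations for the predictor weights w_1..w_p."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    # Biased sample autocorrelation r_xx[k], k = 0..p
    r = np.array([np.dot(x[:n - k], x[k:]) / n for k in range(p + 1)])
    R = np.array([[r[abs(i - j)] for j in range(p)] for i in range(p)])
    w = np.linalg.solve(R, r[1:])
    eps_min = r[0] - np.dot(w, r[1:])  # minimum mean squared error
    return w, eps_min


def whiten(x, w):
    """Prediction error e[n] = x[n] - sum_k w_k x[n-k], for n >= p."""
    h = np.concatenate(([1.0], -np.asarray(w, dtype=float)))
    return np.convolve(x, h, mode="valid")


rng = np.random.default_rng(0)
x = simulate_ar([1.3, -0.9], sigma=1.0, n=20000, rng=rng)
w, eps_min = yule_walker(x, p=2)
e = whiten(x, w)
```

With enough data the residual `e` should be close to white, with a variance near `eps_min`.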




3 LSL filter 



Fig. 2. Lattice structure for LS filter.

The Least Squares based methods build their cost function using all the
information contained in the error function at each step, writing it as the
sum of the errors at each step up to the iteration $n$:

\[ \epsilon[n] = \sum_{i=1}^{n} \lambda^{n-i} e^2(i|n) , \qquad (5) \]

with

\[ e(i|n) = d[i] - \sum_{k=1}^{N} x_{i-k} w_k[n] , \qquad (6) \]

where $d$ is the signal to be estimated, $x$ are the data of the process and
$w$ the weights of the filter. The forgetting factor $\lambda$ lets us tune the
learning rate of the algorithm. This coefficient helps when the time series
contains non-stationary data and we want the algorithm to have a short memory.
For stationary data we fix $\lambda = 1$; otherwise we choose $0 < \lambda < 1$.
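The effect of the forgetting factor can be illustrated with a short Python sketch that evaluates the weighted cost of equation (5) for a given error sequence and the sum of the weights, which tends to $1/(1-\lambda)$ for $\lambda < 1$. The function names are hypothetical, and the errors are passed in as a fixed array rather than recomputed from the weights at time $n$.

```python
import numpy as np


def effective_memory(lam):
    """Limit of sum_i lam**(n-i) for n -> infinity, valid for 0 < lam < 1."""
    if not 0.0 < lam < 1.0:
        raise ValueError("lam must lie strictly between 0 and 1")
    return 1.0 / (1.0 - lam)


def weighted_cost(errors, lam):
    """eps[n] = sum_{i=1}^{n} lam**(n-i) * e_i**2, evaluated at the last index n."""
    e = np.asarray(errors, dtype=float)
    n = len(e)
    # The most recent error gets weight 1, the oldest gets lam**(n-1)
    weights = lam ** np.arange(n - 1, -1, -1)
    return float(np.dot(weights, e ** 2))
```

For example, `lam = 0.99` gives a memory of about 100 samples, so errors much older than that contribute little to the cost.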

There are two ways to implement the Least Squares methods for spectral
estimation: recursively (Recursive Least Squares, RLS, or Kalman filters) or
as Lattice filters using fast techniques [5]. The first kind of algorithm,
examined in [1], has a computational cost proportional to the square of the
filter order, while the cost of the second is linear in the order $P$.

The computational cost of RLS is prohibitive for an on-line implementation.
Moreover, its structure is not modular, thus forcing the choice of the order
$P$ once and for all. An algorithm with a modular structure like that of the
lattice offers the advantage of giving an output of the filter at each stage
$p$, so in principle we can change the order of the filter by imposing some
criteria on its output. The Least Squares Lattice filter is a modular filter
with a computational cost proportional to the order $P$.

In the least squares methods the linear prediction is made for a vector of data
$\mathbf{x}[n]$, so the natural setting in which to develop these methods is
that of vector spaces (a detailed insight into these techniques is reported in
[5]).

Let $X$ be a $p$-dimensional Hilbert space, to which the vectors
$\mathbf{x}[n]$ of acquired data belong. The $p$ vectors $\mathbf{x}_j[n]$ of
length $n$, obtained as time translations, up to length $p$, of the vector
$\mathbf{x}[n]$,

\[
\begin{aligned}
\mathbf{x}_1[n] &= z^{-1}\mathbf{x}[n] = (0, 0, x[1], \ldots, x[n-1]) , \\
\mathbf{x}_2[n] &= z^{-2}\mathbf{x}[n] = (0, 0, 0, x[1], \ldots, x[n-2]) , \\
&\;\;\vdots \\
\mathbf{x}_p[n] &= z^{-p}\mathbf{x}[n] = (0, 0, 0, \ldots, x[1], \ldots, x[n-p])
\end{aligned}
\]

form a basis of this space. A vector $\mathbf{u}$ which belongs to this space
can be written as

\[ \mathbf{u}[n] = \sum_{k=1}^{p} a_k \mathbf{x}_k[n] . \qquad (7) \]

The vector $\mathbf{x}_{p+1}[n]$ with last component $x[n]$ does not belong to
this space, but to a $(p+1)$-dimensional vector space $D$. In the problem of
linear prediction, the best estimate of the desired signal $\mathbf{d}[n]$,
that is $\mathbf{x}_{p+1}[n]$, is obtained using a vector lying in the space
$X$.

Therefore the least squares methods look for the vector $\hat{\mathbf{x}}[n]$
which is closest to the vector $\mathbf{d}[n]$, by minimizing the norm of the
distance between $\hat{\mathbf{x}}[n]$ and $\mathbf{d}[n]$. It can be shown
that this operation corresponds to the projection of the vector
$\mathbf{d}[n]$ from the $(p+1)$-dimensional space $D$ onto the
$p$-dimensional subspace $X$ by a projector $\mathbf{P}$. We can decompose the
vector $\mathbf{d}[n]$ as the sum of the vector $\hat{\mathbf{x}}[n]$ and a
vector which has a non-null component only along the direction orthogonal to
the space $X$. This vector is the vector $\mathbf{e}(n|n)$ which, by
definition, is orthogonal to the data vector $\mathbf{x}[n]$. In fact the
vector orthogonal to the space $X$ can be obtained as

\[ (I - \mathbf{P})\mathbf{d}[n] = \mathbf{d}[n] - \mathbf{P}\mathbf{d}[n] = \mathbf{d}[n] - \hat{\mathbf{x}}[n] = \mathbf{e}(n|n) . \qquad (8) \]



Therefore the vector $\mathbf{d}[n]$ belongs to the vector space $D$, the
direct sum of the subspace $X$ and of the subspace $E$ defined by the vector
$\mathbf{e}(n|n)$:

\[ D = X \oplus E . \qquad (9) \]
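The projection picture can be checked numerically with a small Python sketch for the case $\lambda = 1$. It builds the delayed copies of a data sequence as the columns of a matrix, forms the projector onto their span, and splits the desired signal into its projection and the orthogonal error. The function name is hypothetical, and the sketch uses a windowed data matrix instead of the zero-padded vectors written above, which is an assumption made for brevity.

```python
import numpy as np


def project_onto_delays(x, p):
    """Project d = x[n] onto the span of its p delayed copies (lambda = 1)."""
    x = np.asarray(x, dtype=float)
    n = len(x) - p
    # Column k-1 holds the delayed sequence z^{-k} x, aligned with d
    X = np.column_stack([x[p - k:p - k + n] for k in range(1, p + 1)])
    d = x[p:]
    # Projector onto the column space of X: P = X (X^T X)^{-1} X^T
    P = X @ np.linalg.solve(X.T @ X, X.T)
    x_hat = P @ d   # best estimate lying in the subspace X
    e = d - x_hat   # (I - P) d, orthogonal to the subspace X
    return X, d, P, x_hat, e


rng = np.random.default_rng(1)
X, d, P, x_hat, e = project_onto_delays(rng.standard_normal(203), p=3)
```

The error `e` is orthogonal to every column of `X`, and the squared norm of `d` splits into the squared norms of `x_hat` and `e`, as expected for a direct sum of orthogonal subspaces.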



For the LS adaptive algorithm we want to write the quantities we need for the
estimation of $\hat{\mathbf{x}}[n]$ at the iteration $n$ by means of the
quantities at the iteration $n-1$ and, if we use a modular structure, the same
quantities at the stage $p$ in terms of the ones at the stage $p-1$.

Using the described techniques, if we increase the order of the filter from
$p$ to $p+1$, we must write the new projector $\mathbf{P}_{p+1}$ as a function
of the operator $\mathbf{P}_p$. The new vector space will be the direct sum of
$X$, of dimension $p$, and of the 1-dimensional subspace orthogonal to $X$
along which $\mathbf{e}(n|n)$ lies, and the projector will be

\[ \mathbf{P}_{p+1} = \mathbf{P}_p + \mathbf{P}_1 , \qquad (10) \]

where $\mathbf{P}_1$ denotes the projector onto the one-dimensional space
$\perp X$.

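If we add a new datum $x[n]$ to the space $X(n-1)$, we introduce the vector
$\boldsymbol{\pi}[n]$ orthogonal to the space $X(n-1)$, and the new projection
of the signal $\mathbf{d}[n]$ along $X(n)$ will be

\[ \mathbf{P}[n]\mathbf{d}[n] = \mathbf{P}[n-1]\mathbf{d}[n-1] + \mathbf{P}_{\pi[n]}\mathbf{d}[n] = \hat{\mathbf{x}}[n-1] + \mathbf{P}_{\pi[n]}\mathbf{d}[n] . \qquad (11) \]

Then we can write the relation (11) in matrix form:

\[ \mathbf{P}[n] = \begin{pmatrix} \mathbf{P}[n-1] & 0 \\ 0 & 1 \end{pmatrix} . \qquad (12) \]

A useful parameter is the angle $\gamma_p[n-1]$ between the two subspaces
$X[n-1]$ and $X[n]$, which can be obtained from the relation

\[ \gamma_p[n-1] = \langle \boldsymbol{\pi}[n], \mathbf{P}_p^{\perp}[n]\boldsymbol{\pi}[n] \rangle , \qquad (13) \]

where we introduced the scalar product $\langle \cdot , \cdot \rangle$ between
two vectors $\mathbf{a}[n]$ and $\mathbf{b}[n]$, defined as

\[ \langle \mathbf{a}[n], \mathbf{b}[n] \rangle = \sum_{k=1}^{n} \lambda^{n-k} a[k] b[k] , \qquad (14) \]

and $\mathbf{P}_p^{\perp}[n] = I - \mathbf{P}_p[n]$. Recall that $\lambda$ is
the forgetting factor; if we limit ourselves to $\lambda = 1$, the scalar
product $\langle \cdot , \cdot \rangle$ is simply $\mathbf{a}^T\mathbf{b}$.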

We can write an update relation for the projector in terms of the parameter $\gamma$



Pp[n-1] 
It is important to note that, thanks to the properties of the projector, the
number of operations per iteration is now proportional to the order $P$ and
not to $P^2$ as for the RLS algorithm.

The LSL filter is a lattice filter characterized by recursive relations between
the forward prediction error (FPE) and the backward prediction error (BPE).
With the new notation we can write

\[ \mathbf{e}^f_p[n] = \mathbf{x}[n] - \hat{\mathbf{x}}[n] = [I - \mathbf{P}_p[n]]\,\mathbf{x}[n] . \qquad (16) \]

The scalar error $e^f_p[n]$ can be written as the component along the direction
$\boldsymbol{\pi}[n]$ of the vector perpendicular to the subspace $X[n-1]$:

\[ e^f_p[n] = \langle \boldsymbol{\pi}[n], \mathbf{e}^f_p[n] \rangle . \qquad (17) \]



In a similar way we can write the backward errors. For the backward errors the
space in which we make the prediction is different from the subspace $X[n]$,
because the basis is now given by
$\mathbf{x}[n], \ldots, z^{-(p-1)}\mathbf{x}[n]$. If we introduce the projector
$\mathbf{P}_{p-1}$ on this new basis we can write

\[ \mathbf{e}^b_p[n] = [I - \mathbf{P}_{p-1}[n]]\, z^{-p}\mathbf{x}[n] . \qquad (18) \]

The scalar backward error is given by

\[ e^b_p[n] = \langle \boldsymbol{\pi}[n], \mathbf{e}^b_p[n] \rangle . \qquad (19) \]



The sums of squares of the forward and backward errors can be written as

\[ \epsilon^f_p[n] = \langle \mathbf{e}^f_p[n], \mathbf{e}^f_p[n] \rangle , \qquad (20) \]

\[ \epsilon^b_p[n] = \langle \mathbf{e}^b_p[n], \mathbf{e}^b_p[n] \rangle . \qquad (21) \]

Now we can write the recursive relations for the projectors in the equations
(16) and (18):

\[ e^f_{p+1}[n] = e^f_p[n] + k^b_{p+1}[n]\, e^b_p[n-1] , \qquad (22) \]

\[ e^b_{p+1}[n] = e^b_p[n-1] + k^f_{p+1}[n]\, e^f_p[n] , \qquad (23) \]

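where we introduced the forward and backward reflection coefficients $k^f_p$
and $k^b_p$, defined by

\[ k^b_{p+1}[n] = -\frac{\langle z^{-1}\mathbf{e}^b_p[n], \mathbf{e}^f_p[n] \rangle}{\langle z^{-1}\mathbf{e}^b_p[n], z^{-1}\mathbf{e}^b_p[n] \rangle} , \qquad (24) \]

\[ k^f_{p+1}[n] = -\frac{\langle z^{-1}\mathbf{e}^b_p[n], \mathbf{e}^f_p[n] \rangle}{\langle \mathbf{e}^f_p[n], \mathbf{e}^f_p[n] \rangle} . \qquad (25) \]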
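The adaptive implementation of the LSL filter (22), (23) requires updating the
reflection coefficients with order $p$ and time $n$.

So we must write the recursive relations for the quantities $\epsilon^f_p[n]$,
$\epsilon^b_p[n]$ and
$\Delta_{p+1}[n] = \langle z^{-1}\mathbf{e}^b_p[n], \mathbf{e}^f_p[n] \rangle$.
This can be done using the updating formulas

\[ \Delta_{p+1}[n] = \lambda\,\Delta_{p+1}[n-1] + \frac{e^b_p[n-1]\, e^f_p[n]}{\gamma_p[n-1]} , \qquad (26) \]

\[ \epsilon^f_{p+1}[n] = \epsilon^f_p[n] - \frac{\Delta^2_{p+1}[n]}{\epsilon^b_p[n-1]} , \qquad (27) \]

\[ \epsilon^b_{p+1}[n] = \epsilon^b_p[n-1] - \frac{\Delta^2_{p+1}[n]}{\epsilon^f_p[n]} , \qquad (28) \]

\[ \gamma_{p+1}[n-1] = \gamma_p[n-1] - \frac{[e^b_p[n-1]]^2}{\epsilon^b_p[n-1]} . \qquad (29) \]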



The error at the last stage is the whitened sequence of the input data. So
at the output of the LSL filter we find the reflection coefficients, which we
can use to estimate the AR parameters for the fit to the PSD of the time
series, together with the whitened sequence of data.

The procedure described for the implementation of the LSL filter is the
so-called a posteriori procedure [4]. Since this algorithm involves division
by updated parameters, we must be careful to avoid division by values that are
too small. In the a posteriori procedure, the reflection coefficients are
estimated indirectly through the estimation of $\epsilon^f$ and $\epsilon^b$.

In the a priori implementation, the reflection coefficients are estimated
directly from the forward and backward errors.

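In the a priori implementation the recursive relations for the parameters are
given by

\[ \epsilon^f_{p-1}[n] = \lambda\,\epsilon^f_{p-1}[n-1] + \gamma_{p-1}\, e^f_{p-1}[n]\, e^f_{p-1}[n] , \qquad (30) \]

\[ \epsilon^b_{p-1}[n-1] = \lambda\,\epsilon^b_{p-1}[n-2] + \gamma_{p-1}\, e^b_{p-1}[n-1]\, e^b_{p-1}[n-1] , \qquad (31) \]

\[ e^f_p[n] = e^f_{p-1}[n] + k^f_p[n-1]\, e^b_{p-1}[n-1] , \qquad (32) \]

\[ e^b_p[n] = e^b_{p-1}[n-1] + k^b_p[n-1]\, e^f_{p-1}[n] , \qquad (33) \]

\[ k^f_p[n] = k^f_p[n-1] - \frac{\gamma_{p-1}\, e^b_{p-1}[n-1]\, e^f_p[n]}{\epsilon^b_{p-1}[n-1]} , \qquad (34) \]

\[ k^b_p[n] = k^b_p[n-1] - \frac{\gamma_{p-1}\, e^f_{p-1}[n]\, e^b_p[n]}{\epsilon^f_{p-1}[n]} , \qquad (35) \]

\[ \gamma_p = \gamma_{p-1} - \frac{\gamma^2_{p-1}\, |e^b_{p-1}[n-1]|^2}{\epsilon^b_{p-1}[n-1]} . \qquad (36) \]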

Since the a priori implementation is more stable than the a posteriori one, we
choose to use the a priori recursive relations for the LSL filter to perform
tests on non-stationary data.



4 Modification of LSL filter output to remove non stationarity in the data



In the above relations none of the parameters gives a direct estimate of the
$\sigma$ of the driving white noise process for the AR model. We have to
estimate it by using the quantities $\epsilon^f$ or $\epsilon^b$. In
particular, if we suspect that the data are non-stationary, with the overall
RMS of the process changing in time, we have to estimate it step by step. The
relation we used to estimate $\sigma$ in the LSL filter is

\[ \sigma[n] = \sqrt{\epsilon^f_P[n]/P} , \qquad (38) \]

where $P$ is the order we choose for the filter and consequently for the AR
fit to the PSD.

Moreover, we have to normalize this quantity with respect to the number of
iterations used to estimate it. If $\lambda = 1$ we used all the data to reach
the converged value of $\sigma[n]$, so we have to divide this value at the step
$N$ by the length $N$ of the time series. If we used $\lambda < 1$ we have to
normalize by the window of data we used, which is equal to
$\frac{1}{1-\lambda}$, called the memory of the filter.

If $\sigma$ varies in time we will find a $\sigma$ that follows the changes,
provided we choose a good value for the parameter $\lambda$.
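The step-by-step estimate of the RMS can be sketched in Python as an exponentially weighted running average of the squared output. This is a hedged illustration and not the exact normalization of equation (38): the function name `track_sigma` is hypothetical, the factor involving the order $P$ is omitted, and the sketch divides by the running sum of the weights, which tends to the memory of the filter for $\lambda < 1$ and to the number of samples for $\lambda = 1$.

```python
import numpy as np


def track_sigma(e, lam):
    """Exponentially weighted RMS of a sequence e with forgetting factor lam."""
    e = np.asarray(e, dtype=float)
    sigma = np.empty(len(e))
    energy = 0.0  # sum_i lam**(n-i) * e[i]**2
    weight = 0.0  # sum_i lam**(n-i); tends to 1 / (1 - lam) when lam < 1
    for n, en in enumerate(e):
        energy = lam * energy + en * en
        weight = lam * weight + 1.0
        sigma[n] = np.sqrt(energy / weight)
    return sigma


# Illustrative white sequence with a slowly modulated amplitude
rng = np.random.default_rng(2)
n = 20000
true_sigma = 1.0 + 0.4 * np.sin(2.0 * np.pi * np.arange(n) / n)
e = true_sigma * rng.standard_normal(n)
sigma_hat = track_sigma(e, lam=0.99)
y = e / sigma_hat  # normalized sequence, approximately unit RMS
```

A smaller `lam` follows faster changes at the price of a noisier estimate, since fewer samples enter the average.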

The novelty in our algorithm is the introduction of a normalization of the
output of the whitening filter in such a way as to make the process white and
stationary. We accomplish this task by estimating $\sigma[n]$ at each step $n$
and then dividing the output $e^f_P[n]$, which is our whitened time series, by
$\sigma[n]$.
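Table 1

Modified LSL algorithm

Parameter and variable descriptions

$l$: time index

$p$: index on filter stage

$x[l]$: input data sequence

$y[l]$: whitened data sequence

Initializations at $l = 0$:

for $p = 1, 2, \ldots, P$
    $e^f_p[0] = k^f_p[0] = k^b_p[0] = \gamma_p = 1.0$
    $\epsilon^f_p[0] = \epsilon^b_p[0] = \delta$

$\delta$ being a value close to the average amplitude of the process.

Main Loop

for $l = 1, 2, \ldots, N$
    $e^f_0[l] = e^b_0[l] = x[l]$
    $\epsilon^f_0[l] = \epsilon^b_0[l] = \lambda\,\epsilon^f_0[l-1] + x^2[l]$
    $\gamma_0 = 1.0$
    for $p = 1, 2, \ldots, P$
        $\epsilon^f_{p-1}[l] = \lambda\,\epsilon^f_{p-1}[l-1] + \gamma_{p-1}\, e^f_{p-1}[l]\, e^f_{p-1}[l]$
        $\epsilon^b_{p-1}[l-1] = \lambda\,\epsilon^b_{p-1}[l-2] + \gamma_{p-1}\, e^b_{p-1}[l-1]\, e^b_{p-1}[l-1]$
        $e^f_p[l] = e^f_{p-1}[l] + k^f_p[l-1]\, e^b_{p-1}[l-1]$
        $e^b_p[l] = e^b_{p-1}[l-1] + k^b_p[l-1]\, e^f_{p-1}[l]$
        $k^f_p[l] = k^f_p[l-1] - \gamma_{p-1}\, e^b_{p-1}[l-1]\, e^f_p[l] / \epsilon^b_{p-1}[l-1]$
        $k^b_p[l] = k^b_p[l-1] - \gamma_{p-1}\, e^f_{p-1}[l]\, e^b_p[l] / \epsilon^f_{p-1}[l]$
        $\gamma_p = \gamma_{p-1} - \gamma^2_{p-1}\, |e^b_{p-1}[l-1]|^2 / \epsilon^b_{p-1}[l-1]$
    end
    $y[l] = e^f_P[l] / \sigma[l]$, with $\sigma[l]$ estimated as in equation (38)
end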
If we do not normalize the output of the whitening filter by the varying
$\sigma$, we obtain white but non-stationary data; if we divide the output by
$\sigma$, we obtain white stationary data, which are what we need when applying
an optimal signal search filter for Gaussian and stationary data. In the next
section we show the results of the application of this modified version of the
LSL algorithm to simulated non-stationary data.
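The whole procedure can be sketched in Python as follows. This is a hedged illustration based on the standard a priori LSL recursions with error feedback, not necessarily the author's implementation. The function name `modified_lsl` is hypothetical, the reflection coefficients start at zero (an assumption of this sketch), and the running $\sigma$ is normalized by the sum of the exponential weights, without the factor involving $P$.

```python
import numpy as np


def modified_lsl(x, order, lam=0.99, delta=1.0):
    """A priori LSL whitening followed by division by a tracked sigma.

    Returns (e, y, sigma): the forward prediction error of the last stage,
    the normalized output y = e / sigma, and the running sigma estimate.
    """
    x = np.asarray(x, dtype=float)
    eps_f = np.full(order, delta)  # forward error energies, stages 0..order-1
    eps_b = np.full(order, delta)  # delayed backward error energies
    k_f = np.zeros(order)          # forward reflection coefficients (assumed 0 at start)
    k_b = np.zeros(order)          # backward reflection coefficients (assumed 0 at start)
    b_old = np.zeros(order)        # backward errors of the previous time step
    e = np.empty(len(x))
    sigma = np.empty(len(x))
    energy = weight = 0.0
    for n, xn in enumerate(x):
        f = xn                     # forward error at stage 0
        b_new = np.empty(order + 1)
        b_new[0] = xn              # backward error at stage 0
        gamma = 1.0
        for p in range(order):
            b = b_old[p]           # delayed backward error of stage p
            eps_f[p] = lam * eps_f[p] + gamma * f * f
            eps_b[p] = lam * eps_b[p] + gamma * b * b
            f_next = f + k_f[p] * b            # uses the old k_f
            b_new[p + 1] = b + k_b[p] * f      # uses the old k_b
            k_f[p] -= gamma * b * f_next / eps_b[p]
            k_b[p] -= gamma * f * b_new[p + 1] / eps_f[p]
            gamma -= gamma * gamma * b * b / eps_b[p]
            f = f_next
        b_old = b_new[:order]
        e[n] = f
        # Running RMS of the output; weight tends to 1 / (1 - lam) for lam < 1
        energy = lam * energy + f * f
        weight = lam * weight + 1.0
        sigma[n] = np.sqrt(energy / weight)
    # Floor sigma to avoid 0 / 0 if the very first samples are exactly zero
    y = e / np.maximum(sigma, np.finfo(float).tiny)
    return e, y, sigma
```

The sequence `e` plays the role of the standard whitened output, while `y` is the normalized output that is meant to be both white and stationary.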



5 LSL: application to non stationary noise data 

5.1 Toy model I: varying the parameter

We build an AR process of order $P = 2$, simulating a power spectral density
in which one resonance is present, and we let the frequency of the peak vary
by changing in time the value of one of the two parameters. The values we use
are the following:

\[ A(1) = 1.3 , \quad A(2) = -0.9 , \quad \sigma = 1.0 , \qquad (39) \]

choosing a sampling frequency of 200 Hz.

Fig. 3. Simulated $a_1(t)$ and LSL estimate.
Fig. 4. Time series $x[n]$ and LSL whitened one.

Moreover we let $A(1)$ vary in time with the following law:

\[ A(1,t) = A(1)\exp(\alpha\sin(2\pi\nu t + \phi)) , \qquad (40) \]

with the values $\phi = 1$, $\nu = 0.1$ Hz and $\alpha = 0.2$.
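A possible way to generate such a process is sketched below in Python. The function names and the random seed are illustrative assumptions, and the default values of `alpha`, `nu` and `phi` follow our reading of the values quoted for equation (40). The helper `resonance_frequency` gives the frequency of the complex pole pair for frozen coefficients, which shows how the peak moves when $A(1)$ changes.

```python
import numpy as np


def simulate_toy_model_1(n, fs=200.0, a1=1.3, a2=-0.9, sigma=1.0,
                         alpha=0.2, nu=0.1, phi=1.0, seed=0):
    """AR(2) process whose first coefficient follows equation (40)."""
    rng = np.random.default_rng(seed)
    t = np.arange(n) / fs
    a1_t = a1 * np.exp(alpha * np.sin(2.0 * np.pi * nu * t + phi))
    w = rng.standard_normal(n)
    x = np.zeros(n)
    for i in range(2, n):
        x[i] = a1_t[i] * x[i - 1] + a2 * x[i - 2] + sigma * w[i]
    return t, a1_t, x


def resonance_frequency(a1, a2, fs):
    """Frequency in Hz of the complex pole pair of z^2 - a1 z - a2 = 0."""
    r = np.sqrt(-a2)  # pole radius; requires a2 < 0 and a1**2 < -4 * a2
    return fs * np.arccos(a1 / (2.0 * r)) / (2.0 * np.pi)
```

With the nominal values of equation (39) the resonance sits at roughly 26 Hz, and it moves to lower frequencies when $A(1)$ increases.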

We fit this process using the LSL filter and whiten the data, using
$\lambda = 0.99$. In figure 3 we show the simulated time-varying parameter and
the estimated one. In figure 4 we show the simulated time series and the
output of the LSL whitening filter.

In these data it is one of the parameters of the AR model which changes its
value in time, so when we estimate the reflection coefficients from the data
we also find this variation in time, provided we use a forgetting factor
$\lambda < 1$. The division of the output of the process by the estimated
$\sigma$ therefore does not influence the whitening of the data, since the
value of $\sigma$ is constant in time.

We check these results also by plotting the PSD at the input and at the output
of the modified LSL filter in figure 5. In this figure the peak of the input
PSD is broadened by the motion of the main resonance, while after the
application of the LSL algorithm with forgetting factor $\lambda < 1$ the PSD
becomes flat, and the non-stationarity also disappears.
Fig. 5. Power spectral density of the process $x[n]$ and LSL whitened one.

5.2 Toy model II: varying the $\sigma$

We simulated an AR noise process in which the $\sigma$ of the driving white
normal noise changes in time according to the following function:

\[ \sigma(t) = \sigma\,(1.0 + \alpha\sin(2\pi\omega t)) . \qquad (41) \]

We use an AR(2) model with the same initial values as in the previous toy
model, and the following values for the modulation:

\[ \alpha = 0.4 , \quad \omega = 0.005\ \mathrm{Hz} . \qquad (42) \]
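The second toy model can be generated along the same lines. The following Python sketch assumes the AR(2) coefficients and the sampling frequency of the first toy model; the function name and the seed are hypothetical.

```python
import numpy as np


def simulate_toy_model_2(n, fs=200.0, a=(1.3, -0.9), sigma0=1.0,
                         alpha=0.4, omega=0.005, seed=0):
    """AR(2) process driven by white noise with the modulated sigma of equation (41)."""
    rng = np.random.default_rng(seed)
    t = np.arange(n) / fs
    sigma_t = sigma0 * (1.0 + alpha * np.sin(2.0 * np.pi * omega * t))
    w = rng.standard_normal(n)
    x = np.zeros(n)
    for i in range(2, n):
        x[i] = a[0] * x[i - 1] + a[1] * x[i - 2] + sigma_t[i] * w[i]
    return t, sigma_t, x
```

With a modulation frequency of 0.005 Hz one full period lasts 200 s, that is 40000 samples at 200 Hz, so a simulation of at least this length is needed to see the whole modulation of the RMS.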



In this case the division by the estimated $\sigma$ of the process is crucial
if we want stationary data at the output of the whitening filter. In fact, if
we plot the $\sigma$ estimated by the LSL filter with $\lambda < 1$, we find
that, even if in a noisy way, the estimate follows the variations in time of
the simulated $\sigma$ (see figure 6).

If we plot the output of the standard implementation of the LSL whitening
filter (figure 7), it is evident that this filter has reduced the total RMS of
the data, but it has not removed the modulation of the $\sigma$ of the process.

If we apply the modified LSL filter, as is evident in figure 8, we succeed in
also removing the modulation of the data and in obtaining a stationary
whitened time series.

Fig. 6. Simulated time varying $\sigma[n]$ and LSL estimated $\sigma$ with $\lambda = 0.99$.

Fig. 7. Simulated time series and not modified LSL whitening output.

As is clear in figure 9, the whitening obtained with the modified LSL
algorithm is good and the whitened PSD for non-stationary data is flat.

Fig. 8. Time series $x[n]$ and modified LSL whitened one.

Fig. 9. PSD of simulated non stationary AR(2) and modified LSL whitened one.

6 A realistic case: whitening the voice

In order to see the application of this algorithm to real data we perform a
test on a recorded speech sound, which we convert into a time series. This is
surely a non-stationary time series, and we want to test whether our algorithm
is able to identify the speech and to remove all the features present in it.
This would also mean that we are able to reconstruct the speech from the
learned parameters [9]. In figure 10 we report the voice time series in the
time domain and the output of the standard LSL algorithm, using an order
$P = 100$ for the filter and a value $\lambda = 0.92$ for the forgetting
factor. If we do not apply the modification of the LSL filter, we succeed in
whitening the PSD (see figure 12), but not in removing the non-stationarity,
as is clear in figure 10. In figure 11 we report the results of the modified
algorithm. The whitening is good and the variation of the RMS in time has
disappeared.

Fig. 10. Voice time series in time domain and whitened ones. The time series
was fitted as an AR(100) model and we use forgetting factor $\lambda = 0.92$ in
the LSL algorithm.

In figure 13 we superimpose the estimated $\sigma[n]$ on the voice time series
to show that most of the variation in time of the voice is due not to the
variation of the AR parameters but to the variation of the $\sigma$ of the AR
process.

These tests show that we have a powerful method to identify non-stationary
processes and to whiten them. If we have the estimates of the parameters in
time, we can reconstruct the original time series step by step.



7 Conclusion 

We built a whitening filter using a modified LSL algorithm to remove the
non-stationarity present in the RMS of the driving white noise for a simulated
AR process. We find that this algorithm is able to follow the non-stationarity
coming either from the AR parameters or from the $\sigma$ parameter, and to
remove it from the original data set.

Fig. 11. Voice time series in time domain and whitened ones with modified
algorithm. The time series was fitted as an AR(100) model and we use
forgetting factor $\lambda = 0.92$ in the LSL algorithm.

Fig. 12. PSD for recorded voice, standard and modified LSL whitened ones with
order $P = 100$ and forgetting factor $\lambda = 0.92$.

Fig. 13. Voice time series in time domain and estimated $\sigma[n]$.

This kind of implementation could be useful if we want to deal with stationary
and white data and we have to apply an optimal filter for signal detection.

Moreover, we tested this algorithm on a speech signal, finding encouraging
results on the identification of speech, so this application could also be
useful in the reconstruction of speech phonemes.